Comprehensive or Sampling? — The First Step in Statistical Investigation
MATH701B-PEP-CN Lesson 6
Sampling, Estimation, Population, Sample
Statistics is the science of collecting, organizing, and analyzing data to make inferences and decisions. It is like tasting a pot of eight-treasure rice pudding: you don’t need to eat the entire pot to know whether it is sweet or salty. Just stir it well and take a spoonful to ‘see the whole picture from a single spot.’ This is the charm of statistical investigation.

Core Concepts: Who Is Our Main Focus?

Before any investigation, we must clearly define our research subjects:

  • Population: The entire group of objects under examination.
  • Individual: Each object that makes up the population.
  • Sample: A subset of individuals drawn from the population.
  • Sample size: The number of individuals in the sample (note: it is a pure number with no units).
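
The four terms above can be mapped onto a small Python sketch. The data here is made up for illustration: the "population" is assumed to be the heights of 500 students, and the names `population`, `individual`, `sample`, and `sample_size` are hypothetical labels chosen to match the definitions.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Population: the entire group under examination
# (assumed here: heights in cm of 500 students, invented values).
population = [rng.randint(150, 190) for _ in range(500)]

# Individual: one object that makes up the population.
individual = population[0]

# Sample: a subset of individuals drawn from the population.
sample = rng.sample(population, 50)

# Sample size: how many individuals are in the sample (a plain count, no units).
sample_size = len(sample)

print(f"population has {len(population)} individuals, sample size is {sample_size}")
```

Note that `sample_size` is simply 50, not "50 students" or "50 cm", which matches the remark that sample size carries no units.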

Choosing the Investigation Method

Why not always conduct comprehensive surveys (surveys that examine every object)?

Scenario A: Census

Take the sixth national census in 2010 as an example. Extremely high accuracy is required, and the data affects the national economy and people's livelihoods, so every individual must be accounted for.

Scenario B: Impact Resistance Test

If we test the impact resistance of a batch of cars, a comprehensive survey would mean destroying every new vehicle. In this case, a sampling survey (selecting a portion of the objects for investigation and inferring about the whole) is the only viable option.

The Science and Pitfalls of Sampling

To ensure a single spoonful represents the entire pot, we must follow the principles of simple random sampling, so that each individual has an equal chance of being selected. We must also avoid the following three pitfalls:

  • Too small: The sample size is too small, making results prone to chance and unable to objectively reflect the population.
  • Too large: An oversized sample defeats the purpose of sampling, which is to save time and effort.
  • Bias: For example, estimating the entire school’s characteristics based only on nearby classmates yields a sample that lacks representativeness.
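
A minimal Python sketch can make simple random sampling and two of these pitfalls concrete. Everything in it is an assumption for illustration: the population is 1,000 invented heights arranged so that neighbouring individuals resemble each other, and `simple_random_sample` is a hypothetical helper built on the standard library's `random.sample`, which draws without replacement and gives every individual the same chance.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 heights in cm, stored grade by grade, so
# individuals that sit next to each other in the list are similar (made-up data).
population = [rng.gauss(140 + 10 * (i // 250), 5) for i in range(1000)]
true_mean = statistics.mean(population)

def simple_random_sample(items, k):
    """Draw k items without replacement; every individual is equally likely."""
    return rng.sample(items, k)

tiny = simple_random_sample(population, 3)    # pitfall: sample too small
fair = simple_random_sample(population, 100)  # moderate simple random sample
nearby = population[:100]                     # pitfall: biased, only the first 100

for name, s in [("tiny", tiny), ("fair", fair), ("nearby", nearby)]:
    print(f"{name:>6}: n={len(s):3d}, mean={statistics.mean(s):.1f} "
          f"(population mean {true_mean:.1f})")
```

In this setup the "nearby" sample has the same size as the fair one, yet its mean sits far from the population mean because it only covers one part of the population. The tiny sample is unbiased but swings widely from draw to draw.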
🎯 Core Logic
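The core of a sampling survey is using sample data to infer the situation of the whole population. Its logical formula is $q \approx \frac{p}{n} \times m$, where $q$ is the estimated value for the population. The other symbols are not defined here; the natural reading is that $p$ is the number of sampled individuals with the characteristic of interest, $n$ is the sample size, and $m$ is the population size.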
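
As a worked illustration of the formula, the sketch below scales a sample proportion up to the population. It assumes that $p$ is the count observed in the sample, $n$ is the sample size, and $m$ is the population size; the lesson does not spell these out, so treat that reading, the function name `estimate_population_count`, and the numbers as hypothetical.

```python
def estimate_population_count(p, n, m):
    """Estimate q ≈ (p / n) * m under the assumed meaning of the symbols.

    p: number of sampled individuals with the characteristic (assumption)
    n: sample size (assumption)
    m: population size (assumption)
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return p / n * m

# Made-up example: 10 of 40 sampled students wear glasses,
# and the school has 1,200 students in total.
q = estimate_population_count(p=10, n=40, m=1200)
print(q)  # 300.0
```

The result is an estimate, not an exact count: a different random sample of 40 students would generally give a somewhat different $q$.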